Selection of Oligonucleotide Probes for Protein Coding Sequences
Abstract
MOTIVATION: Large arrays of oligonucleotide probes have become popular tools for analyzing RNA expression. However, to date, most oligo collections contain poorly validated sequences or are biased toward untranslated regions (UTRs). Here we present a strategy for picking oligos for microarrays that focuses on a design universe consisting exclusively of protein coding regions. We describe the constraints in oligo design that are imposed by this strategy, as well as a software tool that allows the strategy to be applied broadly.
RESULTS: In this work we sequentially apply a variety of simple filters to candidate sequences for oligo probes. The primary filter rejects probes that share contiguous identity with any other sequence in the design universe beyond a pre-established threshold length. We find that rejecting oligos that contain 15 bases of perfect match with other sequences in the design universe is a feasible strategy for oligo selection for probe arrays designed to interrogate mammalian RNA populations. Filters that remove sequences with low complexity or predicted poor probe accessibility narrow the candidate probe space only slightly. Rejection based on global sequence alignment is performed as a secondary, rather than primary, test, leading to an algorithm that is computationally efficient. Splice isoforms pose unique challenges, and we find that isoform prevalence will for the most part have to be determined by analyzing the hybridization patterns of partially redundant oligonucleotides.
AVAILABILITY: The oligo design program OligoPicker and its source code are freely available at our website.
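The primary filter described in the abstract can be illustrated with a short sketch. This is not OligoPicker's code: the function names, the dictionary-based design universe, and the k-mer index are assumptions made for illustration, and the sketch ignores reverse complements and all of the secondary filters. It relies on the fact that a probe shares a contiguous perfect match of at least k bases with another sequence exactly when the two share at least one k-mer.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_kmer_index(universe: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map every k-mer in the design universe to the sequences containing it."""
    index: Dict[str, Set[str]] = {}
    for name, seq in universe.items():
        for kmer in kmers(seq, k):
            index.setdefault(kmer, set()).add(name)
    return index


def passes_contiguous_filter(probe: str, target: str,
                             index: Dict[str, Set[str]], k: int = 15) -> bool:
    """Reject a probe if any of its k-mers occurs in a non-target sequence."""
    for kmer in kmers(probe, k):
        if index.get(kmer, set()) - {target}:
            return False
    return True


def candidate_probes(universe: Dict[str, str], target: str,
                     probe_len: int, k: int = 15) -> List[str]:
    """List every probe_len window of the target that survives the filter."""
    index = build_kmer_index(universe, k)
    seq = universe[target].upper()
    windows = (seq[i:i + probe_len] for i in range(len(seq) - probe_len + 1))
    return [w for w in windows if passes_contiguous_filter(w, target, index, k)]
```

Building the index once and testing each candidate by set lookup keeps the primary filter cheap, which is consistent with the abstract's point that alignment-based rejection can be deferred to a secondary step.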
Related resources
Phylogenetic Analysis of Three Long Non-coding RNA Genes: AK082072, AK043754 and AK082467
It is now clear that protein is just one of the functional products of the eukaryotic genome. Indeed, more of the human genome is transcribed into non-coding sequences than into protein-coding sequences. In this study, we selected three long non-coding RNAs, namely AK082072, AK043754 and AK082467, which show brain expression and local region conservation among vertebr...
An Algorithm for Highly Specific Recognition of Protein-coding Regions
Since absolutely reliable recognition of protein-coding regions in eukaryote genomic DNA sequences by computational methods is unattainable, most existing algorithms try to keep some balance between underprediction and overprediction. However, in experimental practice it is often sufficient to have just a few protein-coding segments, but predicted with high specificity, that is, with (almost) no o...
An in vitro selection scheme for oligonucleotide probes to discriminate between closely related DNA sequences
Using an in vitro selection, we have obtained oligonucleotide probes with high discriminatory power against multiple, similar nucleic acid sequences, which is often required in diagnostic applications for simultaneous testing of such sequences. We have tested this approach, referred to as iterative hybridizations, by selecting probes against six 22-nt-long sequence variants representing human p...
Comparisons of substitution, insertion and deletion probes for resequencing and mutational analysis using oligonucleotide microarrays
Although oligonucleotide probes complementary to single nucleotide substitutions are commonly used in microarray-based screens for genetic variation, little is known about the hybridization properties of probes complementary to small insertions and deletions. It is necessary to define the hybridization properties of these latter probes in order to improve the specificity and sensitivity of olig...
BIGPROBE: a computer program that predicts the sequence of long oligonucleotide probes with high reliability.
We have written a computer program, BIGPROBE, which facilitates the design of long nucleic acid probes from the partial or complete amino acid sequence of a protein. BIGPROBE relies upon information on codon usage, intercodon dinucleotide frequency, and potential probe self-complementarity. We have examined the accuracy with which the program predicts coding sequences using sample human and rat...
Journal: Bioinformatics
Volume 19, Issue 7
Publication date: 2003